Detecting polymorphic regions in Arabidopsis thaliana with resequencing microarrays.
Authors
Abstract
Whole-genome oligonucleotide resequencing arrays have allowed the comprehensive discovery of single nucleotide polymorphisms (SNPs) in eukaryotic genomes of moderate to large size. With this technology, the detection rate for isolated SNPs is typically high. However, it is greatly reduced when other polymorphisms are located near a SNP, because multiple mismatches inhibit hybridization to arrayed oligonucleotides. Contiguous tracts of suppressed hybridization therefore typify polymorphic regions (PRs) such as clusters of SNPs or deletions. We developed a machine learning method, designated margin-based prediction of polymorphic regions (mPPR), to predict PRs from resequencing array data. Conceptually similar to hidden Markov models, the method is trained with discriminative learning techniques related to support vector machines, and accurately identifies even very short polymorphic tracts (<10 bp). We applied this method to resequencing array data previously generated for the euchromatic genomes of 20 strains (accessions) of the best-characterized plant, Arabidopsis thaliana. Nonredundantly, 27% of the genome was included within the boundaries of PRs predicted at high specificity (approximately 97%). The resulting data set provides a fine-scale view of polymorphic sequences in A. thaliana; patterns of polymorphism not apparent in SNP data were readily detected, especially for noncoding regions. Our predictions provide a valuable resource for evolutionary genetic and functional studies in A. thaliana, and our method is applicable to similar data sets in other species. More broadly, our computational approach can be applied to other segmentation tasks related to the analysis of genomic variation.
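The segmentation idea in the abstract (contiguous tracts of suppressed hybridization marking polymorphic regions) can be illustrated with a small two-state Viterbi-style decoder. This is only a sketch under stated assumptions, not mPPR itself: the function names, the `threshold` and `switch_penalty` parameters, and the hand-set scores are all hypothetical, and hybridization intensities are assumed to be normalized to roughly the range 0 to 1. mPPR learns its scoring with discriminative training, which this example does not attempt.

```python
def segment_polymorphic_regions(intensities, threshold=0.5, switch_penalty=2.0):
    """Label each position 0 (conserved) or 1 (polymorphic) by dynamic programming.

    Assumption: scores are hand-set for illustration. Signal below `threshold`
    favours the polymorphic state, and every state change costs `switch_penalty`.
    """
    n = len(intensities)
    if n == 0:
        return []

    def score(x, state):
        # Low hybridization signal supports state 1, high signal supports state 0.
        return (threshold - x) if state == 1 else (x - threshold)

    best = [[0.0, 0.0] for _ in range(n)]  # best[i][s]: best path score ending in s
    back = [[0, 0] for _ in range(n)]      # back[i][s]: previous state on that path
    best[0] = [score(intensities[0], 0), score(intensities[0], 1)]

    for i in range(1, n):
        for s in (0, 1):
            stay = best[i - 1][s]
            switch = best[i - 1][1 - s] - switch_penalty
            if stay >= switch:
                best[i][s] = stay + score(intensities[i], s)
                back[i][s] = s
            else:
                best[i][s] = switch + score(intensities[i], s)
                back[i][s] = 1 - s

    # Trace back from the better final state to recover the label sequence.
    state = 0 if best[-1][0] >= best[-1][1] else 1
    labels = [state]
    for i in range(n - 1, 0, -1):
        state = back[i][state]
        labels.append(state)
    labels.reverse()
    return labels


def polymorphic_intervals(labels):
    """Return half-open (start, end) intervals of consecutive positions labelled 1."""
    intervals = []
    start = None
    for i, label in enumerate(labels):
        if label == 1 and start is None:
            start = i
        elif label == 0 and start is not None:
            intervals.append((start, i))
            start = None
    if start is not None:
        intervals.append((start, len(labels)))
    return intervals
```

A larger `switch_penalty` suppresses short tracts, so a single weak probe is not reported as a region; a smaller one lets short tracts through. In a learned model these trade-offs would come from training data rather than from hand-picked constants.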
Similar resources
Global analysis of allele-specific expression in Arabidopsis thaliana.
Gene expression is a complex trait determined by various genetic and nongenetic factors. Among the genetic factors, allelic difference may play a critical role in gene regulation. In this study we globally dissected cis (allelic) and trans sources of genetic variation in F(1) hybrids between two Arabidopsis thaliana wild accessions, Columbia (Col) and Vancouver (Van), using a new high-density S...
Low-coverage resequencing detects meiotic recombination pattern and features in tomato RILs
Traditional plant breeding relies on meiotic recombination for mixing of parental alleles to create novel allele combinations. Detailed analysis of recombination patterns in model organisms shows that recombination is tightly regulated within the genome, but frequencies vary extensively along chromosomes. Despite being a model organism for fruit developmental studies, high-resolution recombinat...
Yeast Two Hybrid cDNA Screening of Arabidopsis thaliana for SETH4 Protein Interaction
The SETH4 coding sequence (2013 bp) is a member of a gene family expressed in gametophytic tissues of Arabidopsis thaliana. This fragment was PCR amplified using Kod Hi Fi DNA polymerase enzyme. The fragment was cloned into the pGBKT7 bait vector, and transformed E. coli DH5α cells containing the vector were selected on LB medium containing kanamycin. Finally, the pGBKT7-SETH4 bait was transformed into yeast ...
Patterns of Polymorphism at the Self-Incompatibility Locus in 1,083 Arabidopsis thaliana Genomes
Although the transition to selfing in the model plant Arabidopsis thaliana involved the loss of the self-incompatibility (SI) system, it clearly did not occur due to the fixation of a single inactivating mutation at the locus determining the specificities of SI (the S-locus). At least three groups of divergent haplotypes (haplogroups), corresponding to ancient functional S-alleles, have been ma...
Negative control of Strictisidine synthase like-7 gene on salt stress resistance in Arabidopsis thaliana
Strictosidine synthase-like (SSL) is a group of gene families in the Arabidopsis genome whose orthologues in other plants are key enzymes in the monoterpenoid indole alkaloid biosynthesis pathway. SSL7 is upregulated upon treatment of Arabidopsis plants with signaling molecules such as SA, methyl jasmonate and ethylene. To find the functional role of the gene, a T-DNA-mediated knockout...
Journal title:
- Genome research
Volume 18, Issue 6
Pages -
Publication date 2008